A Simple Model-Based Approach to Inferring and Visualizing Cancer Mutation Signatures
Abstract
Recent advances in sequencing technologies have enabled the production of massive amounts of data on somatic mutations from cancer genomes. These data have led to the detection of characteristic patterns of somatic mutations or "mutation signatures" at an unprecedented resolution, with the potential for new insights into the causes and mechanisms of tumorigenesis. Here we present new methods for modelling, identifying and visualizing such mutation signatures. Our methods greatly simplify mutation signature models compared with existing approaches, reducing the number of parameters by orders of magnitude even while increasing the contextual factors (e.g. the number of flanking bases) that are accounted for. This improves both sensitivity and robustness of inferred signatures. We also provide a new intuitive way to visualize the signatures, analogous to the use of sequence logos to visualize transcription factor binding sites. We illustrate our new method on somatic mutation data from urothelial carcinoma of the upper urinary tract, and a larger dataset from 30 diverse cancer types. The results illustrate several important features of our methods, including the ability of our new visualization tool to clearly highlight the key features of each signature, the improved robustness of signature inferences from small sample sizes, and more detailed inference of signature characteristics such as strand biases and sequence context effects at the base two positions 5' to the mutated site. The overall framework of our work is based on probabilistic models that are closely connected with "mixed-membership models", which are widely used in population genetic admixture analysis and in machine learning for document clustering. We argue that recognizing these relationships should help improve understanding of mutation signature extraction problems and suggest ways to further improve the statistical methods.
Our methods are implemented in an R package pmsignature (https://github.com/friend1ws/pmsignature) and a web application available at https://friend1ws.shinyapps.io/pmsignature_shiny/.
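The claim that the number of parameters falls by orders of magnitude can be illustrated with a small counting exercise. The abstract does not spell out the parameterization, so the sketch below rests on a stated assumption: the simplified model treats each mutation feature (the substitution type, each flanking base, and optionally the strand) as an independent categorical variable, whereas a full model assigns one probability to every joint mutation pattern. The function names and the `n_flank` and `strand` arguments are hypothetical and are not part of pmsignature.

```python
# Hypothetical parameter-counting sketch (not the pmsignature API).
# Assumption: 6 substitution types, 4 possible bases per flanking position,
# and an optional binary strand feature.

def full_model_params(n_flank: int, strand: bool = False) -> int:
    """Free parameters per signature when every joint pattern has its own probability."""
    patterns = 6 * 4 ** n_flank * (2 if strand else 1)
    return patterns - 1  # probabilities sum to one


def independent_model_params(n_flank: int, strand: bool = False) -> int:
    """Free parameters per signature when features are modelled independently."""
    substitution = 6 - 1          # one categorical over 6 substitution types
    flanking = n_flank * (4 - 1)  # one categorical over 4 bases per position
    strand_term = 1 if strand else 0
    return substitution + flanking + strand_term


if __name__ == "__main__":
    # n_flank counts all flanking positions, e.g. 4 = two bases on each side.
    for n_flank, strand in [(2, False), (4, False), (4, True)]:
        print(
            n_flank,
            strand,
            full_model_params(n_flank, strand),
            independent_model_params(n_flank, strand),
        )
```

Under these assumptions, adding a flanking position multiplies the full model's parameter count by four but adds only three parameters to the independent model, which is why the gap widens as more context is included.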